Goto

Collaborating Authors

 contact information


Jeffrey Epstein's Ties to CBP Agents Sparked a DOJ Probe

WIRED

Documents say customs officers in the US Virgin Islands had friendly relationships with Epstein years after his 2008 conviction, showing how the infamous sex offender tried to cultivate allies. United States prosecutors and federal law enforcement spent over a year examining ties between Jeffrey Epstein and Customs and Border Protection officers stationed in the US Virgin Islands (USVI), according to documents recently released by the Department of Justice. As The Guardian and New York Times have reported, emails, text messages, and investigative records show that Epstein cultivated friendships with several officers, entertaining them on his island and offering to take them for whale-watching trips in his helicopter. He even brought one cannolis for Christmas Eve. In turn, Epstein would bring certain officers his complaints about his treatment at the hands of other CBP and federal agents.


Identity Theft in AI Conference Peer Review

Communications of the ACM

Academia heavily relies on trust. This trust-based system, however, creates a significant vulnerability: identity theft. In this Opinion column, we describe newly uncovered cases of identity theft within the scientific peer-review process within the research area of artificial intelligence (AI), involving modus operandi that could also disrupt other academic procedures. We begin by outlining the peer-review process, focusing on scientific conferences since they are the most prominent venues of publication in computer science. Peer review is foundational to scientific inquiry, relying on researchers to voluntarily apply their expertise in evaluating scientific papers.


ZeroDexGrasp: Zero-Shot Task-Oriented Dexterous Grasp Synthesis with Prompt-Based Multi-Stage Semantic Reasoning

arXiv.org Artificial Intelligence

Task-oriented dexterous grasping holds broad application prospects in robotic manipulation and human-object interaction. However, most existing methods still struggle to generalize across diverse objects and task instructions, as they heavily rely on costly labeled data to ensure task-specific semantic alignment. In this study, we propose \textbf{ZeroDexGrasp}, a zero-shot task-oriented dexterous grasp synthesis framework integrating Multimodal Large Language Models with grasp refinement to generate human-like grasp poses that are well aligned with specific task objectives and object affordances. Specifically, ZeroDexGrasp employs prompt-based multi-stage semantic reasoning to infer initial grasp configurations and object contact information from task and object semantics, then exploits contact-guided grasp optimization to refine these poses for physical feasibility and task alignment. Experimental results demonstrate that ZeroDexGrasp enables high-quality zero-shot dexterous grasping on diverse unseen object categories and complex task requirements, advancing toward more generalizable and intelligent robotic grasping.


Identity Theft in AI Conference Peer Review

arXiv.org Artificial Intelligence

Abstract: We discuss newly uncovered cases of identity theft in the scientific peer-review process within artificial intelligence (AI) research, with broader implications for other academic procedures. We detail how dishonest researchers exploit the peer-review system by creating fraudulent reviewer profiles to manipulate paper evaluations, leveraging weaknesses in reviewer recruitment workflows and identity verification processes. The findings highlight the critical need for stronger safeguards against identity theft in peer review and academia at large, and to this end, we also propose mitigating strategies. Academia heavily relies on trust. This trust-based system, however, creates a significant vulnerability: identity theft.


A major AI training data set contains millions of examples of personal data

MIT Technology Review

The bottom line, says William Agnew, a postdoctoral fellow in AI ethics at Carnegie Mellon University and one of the coauthors, is that "anything you put online can [be] and probably has been scraped." The researchers found thousands of instances of validated identity documents--including images of credit cards, driver's licenses, passports, and birth certificates--as well as over 800 validated job application documents (including résumés and cover letters), which were confirmed through LinkedIn and other web searches as being associated with real people. A number of the résumés disclosed sensitive information including disability status, the results of background checks, birth dates and birthplaces of dependents, and race. When résumés were linked to people with online presences, researchers also found contact information, government identifiers, sociodemographic information, face photographs, home addresses, and the contact information of other people (like references). When it was released in 2023, DataComp CommonPool, with its 12.8 billion data samples, was the largest existing data set of publicly available image-text pairs, which are often used to train generative text-to-image models.


GelFusion: Enhancing Robotic Manipulation under Visual Constraints via Visuotactile Fusion

arXiv.org Artificial Intelligence

Visuotactile sensing offers rich contact information that can help mitigate performance bottlenecks in imitation learning, particularly under vision-limited conditions, such as ambiguous visual cues or occlusions. Effectively fusing visual and visuotactile modalities, however, presents ongoing challenges. We introduce GelFusion, a framework designed to enhance policies by integrating visuotactile feedback, specifically from high-resolution GelSight sensors. GelFusion using a vision-dominated cross-attention fusion mechanism incorporates visuotactile information into policy learning. To better provide rich contact information, the framework's core component is our dual-channel visuotactile feature representation, simultaneously leveraging both texture-geometric and dynamic interaction features. We evaluated GelFusion on three contact-rich tasks: surface wiping, peg insertion, and fragile object pick-and-place. Outperforming baselines, GelFusion shows the value of its structure in improving the success rate of policy learning.


ContactFusion: Stochastic Poisson Surface Maps from Visual and Contact Sensing

arXiv.org Artificial Intelligence

Robust and precise robotic assembly entails insertion of constituent components. Insertion success is hindered when noise in scene understanding exceeds tolerance limits, especially when fabricated with tight tolerances. In this work, we propose ContactFusion which combines global mapping with local contact information, fusing point clouds with force sensing. Our method entails a Rejection Sampling based contact occupancy sensing procedure which estimates contact locations on the end-effector from Force/Torque sensing at the wrist. We demonstrate how to fuse contact with visual information into a Stochastic Poisson Surface Map (SPSMap) - a map representation that can be updated with the Stochastic Poisson Surface Reconstruction (SPSR) algorithm. We first validate the contact occupancy sensor in simulation and show its ability to detect the contact location on the robot from force sensing information. Then, we evaluate our method in a peg-in-hole task, demonstrating an improvement in the hole pose estimate with the fusion of the contact information with the SPSMap.


Multi-GraspLLM: A Multimodal LLM for Multi-Hand Semantic Guided Grasp Generation

arXiv.org Artificial Intelligence

Multi-hand semantic grasp generation aims to generate feasible and semantically appropriate grasp poses for different robotic hands based on natural language instructions. Although the task is highly valuable, due to the lack of multi-hand grasp datasets with fine-grained contact description between robotic hands and objects, it is still a long-standing difficult task. In this paper, we present Multi-GraspSet, the first large-scale multi-hand grasp dataset with automatically contact annotations. Based on Multi-GraspSet, we propose Multi-GraspLLM, a unified language-guided grasp generation framework. It leverages large language models (LLM) to handle variable-length sequences, generating grasp poses for diverse robotic hands in a single unified architecture. Multi-GraspLLM first aligns the encoded point cloud features and text features into a unified semantic space. It then generates grasp bin tokens which are subsequently converted into grasp pose for each robotic hand via hand-aware linear mapping. The experimental results demonstrate that our approach significantly outperforms existing methods on Multi-GraspSet. More information can be found on our project page https://multi-graspllm.github.io.


A Learning-based Controller for Multi-Contact Grasps on Unknown Objects with a Dexterous Hand

arXiv.org Artificial Intelligence

Existing grasp controllers usually either only support finger-tip grasps or need explicit configuration of the inner forces. We propose a novel grasp controller that supports arbitrary grasp types, including power grasps with multi-contacts, while operating self-contained on before unseen objects. No detailed contact information is needed, but only a rough 3D model, e.g., reconstructed from a single depth image. First, the external wrench being applied to the object is estimated by using the measured torques at the joints. Then, the torques necessary to counteract the estimated wrench while keeping the object at its initial pose are predicted. The torques are commanded via desired joint angles to an underlying joint-level impedance controller. To reach real-time performance, we propose a learning-based approach that is based on a wrench estimator- and a torque predictor neural network. Both networks are trained in a supervised fashion using data generated via the analytical formulation of the controller. In an extensive simulation-based evaluation, we show that our controller is able to keep 83.1% of the tested grasps stable when applying external wrenches with up to 10N. At the same time, we outperform the two tested baselines by being more efficient and inducing less involuntary object movement. Finally, we show that the controller also works on the real DLR-Hand II, reaching a cycle time of 6ms.


Do Large Language Models Understand Verbal Indicators of Romantic Attraction?

arXiv.org Artificial Intelligence

What makes people 'click' on a first date and become mutually attracted to one another? While understanding and predicting the dynamics of romantic interactions used to be exclusive to human judgment, we show that Large Language Models (LLMs) can detect romantic attraction during brief getting-to-know-you interactions. Examining data from 964 speed dates, we show that ChatGPT (and Claude 3) can predict both objective and subjective indicators of speed dating success (r=0.12-0.23). ChatGPT's predictions of actual matching (i.e., the exchange of contact information) were not only on par with those of human judges who had access to the same information but incremental to speed daters' own predictions. While some of the variance in ChatGPT's predictions can be explained by common content dimensions (such as the valence of the conversations) the fact that there remains a substantial proportion of unexplained variance suggests that ChatGPT also picks up on conversational dynamics. In addition, ChatGPT's judgments showed substantial overlap with those made by the human observers (mean r=0.29), highlighting similarities in their representation of romantic attraction that is, partially, independent of accuracy.